Overview

Dataset statistics

Number of variables: 11
Number of observations: 10000
Missing cells: 99
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 859.5 KiB
Average record size in memory: 88.0 B
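
The two derived figures in this table can be checked by hand. The sketch below is not part of the generated report; it assumes that the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes, and recomputes both values from the reported totals.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 11
n_observations = 10_000
missing_cells = 99
total_size_kib = 859.5

# Assumption: percentages are taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Under these assumptions both values round to the figures shown in the table (0.1% and 88.0 B).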

Variable types

Numeric: 10
Categorical: 1

Alerts

longitude is highly correlated with latitude (High correlation)
latitude is highly correlated with longitude (High correlation)
total_rooms is highly correlated with total_bedrooms and 2 other fields (High correlation)
total_bedrooms is highly correlated with total_rooms and 2 other fields (High correlation)
population is highly correlated with total_rooms and 2 other fields (High correlation)
households is highly correlated with total_rooms and 2 other fields (High correlation)
median_income is highly correlated with median_house_value (High correlation)
median_house_value is highly correlated with median_income (High correlation)
df_index is highly correlated with longitude and 3 other fields (High correlation)
longitude is highly correlated with df_index and 3 other fields (High correlation)
latitude is highly correlated with df_index and 3 other fields (High correlation)
median_house_value is highly correlated with df_index and 4 other fields (High correlation)
ocean_proximity is highly correlated with df_index and 3 other fields (High correlation)
df_index has unique values (Unique)

Reproduction

Analysis started: 2022-03-15 18:09:30.378533
Analysis finished: 2022-03-15 18:09:39.540449
Duration: 9.16 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10314.8348
Minimum: 0
Maximum: 20639
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 980.95
Q1: 5114.25
median: 10277
Q3: 15532.25
95-th percentile: 19602.05
Maximum: 20639
Range: 20639
Interquartile range (IQR): 10418

Descriptive statistics

Standard deviation: 5983.435059
Coefficient of variation (CV): 0.5800805514
Kurtosis: -1.205372537
Mean: 10314.8348
Median Absolute Deviation (MAD): 5210.5
Skewness: -0.002737784802
Sum: 103148348
Variance: 35801495.1
Monotonicity: Not monotonic
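
As an illustration of how several of these statistics relate to one another, the following sketch computes them for a small made-up list using only the Python standard library. The helper name `describe` and the sample values are hypothetical, and the sketch assumes the sample standard deviation (n − 1 denominator) and linearly interpolated quartiles, which may differ from the settings the report used.

```python
import statistics


def describe(values):
    """Return a few descriptive statistics for a list of numbers."""
    mean = statistics.fmean(values)
    std = statistics.stdev(values)  # sample standard deviation (n - 1)
    median = statistics.median(values)
    # Median Absolute Deviation: median distance from the median.
    mad = statistics.median([abs(v - median) for v in values])
    # Quartiles with linear interpolation between neighbouring points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": mean,
        "std": std,
        "cv": std / mean,  # coefficient of variation
        "mad": mad,
        "iqr": q3 - q1,  # interquartile range
        "range": max(values) - min(values),
    }


toy = [2, 4, 4, 5, 7, 9, 11]  # hypothetical values, not from the dataset
stats = describe(toy)
print(stats)

# The same ratio applied to the df_index figures reported in this section.
print(5983.435059 / 10314.8348)
```

For df_index, the final ratio agrees with the reported coefficient of variation, and Q3 − Q1 = 15532.25 − 5114.25 = 10418 matches the reported interquartile range.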
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
6141 | 1 | < 0.1%
9614 | 1 | < 0.1%
5528 | 1 | < 0.1%
11671 | 1 | < 0.1%
9622 | 1 | < 0.1%
3475 | 1 | < 0.1%
17810 | 1 | < 0.1%
5520 | 1 | < 0.1%
11663 | 1 | < 0.1%
15757 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Value | Count | Frequency (%)
0 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
11 | 1 | < 0.1%
12 | 1 | < 0.1%
13 | 1 | < 0.1%
15 | 1 | < 0.1%

Value | Count | Frequency (%)
20639 | 1 | < 0.1%
20637 | 1 | < 0.1%
20632 | 1 | < 0.1%
20631 | 1 | < 0.1%
20630 | 1 | < 0.1%
20629 | 1 | < 0.1%
20625 | 1 | < 0.1%
20623 | 1 | < 0.1%
20620 | 1 | < 0.1%
20619 | 1 | < 0.1%

longitude
Real number (ℝ)

HIGH CORRELATION

Distinct: 766
Distinct (%): 7.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -119.599309
Minimum: -124.35
Maximum: -114.31
Zeros: 0
Zeros (%): 0.0%
Negative: 10000
Negative (%): 100.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: -124.35
5-th percentile: -122.46
Q1: -121.82
median: -118.55
Q3: -118.02
95-th percentile: -117.08
Maximum: -114.31
Range: 10.04
Interquartile range (IQR): 3.8

Descriptive statistics

Standard deviation: 2.00345197
Coefficient of variation (CV): -0.01675136743
Kurtosis: -1.360759623
Mean: -119.599309
Median Absolute Deviation (MAD): 1.32
Skewness: -0.2628071996
Sum: -1195993.09
Variance: 4.013819794
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
-118.29 | 80 | 0.8%
-118.3 | 78 | 0.8%
-118.31 | 72 | 0.7%
-118.35 | 70 | 0.7%
-118.28 | 69 | 0.7%
-118.36 | 68 | 0.7%
-118.34 | 67 | 0.7%
-118.32 | 67 | 0.7%
-118.14 | 64 | 0.6%
-118.13 | 63 | 0.6%
Other values (756) | 9302 | 93.0%

Value | Count | Frequency (%)
-124.35 | 1 | < 0.1%
-124.3 | 1 | < 0.1%
-124.27 | 1 | < 0.1%
-124.25 | 1 | < 0.1%
-124.23 | 1 | < 0.1%
-124.22 | 1 | < 0.1%
-124.21 | 3 | < 0.1%
-124.19 | 1 | < 0.1%
-124.18 | 4 | < 0.1%
-124.17 | 7 | 0.1%

Value | Count | Frequency (%)
-114.31 | 1 | < 0.1%
-114.47 | 1 | < 0.1%
-114.49 | 1 | < 0.1%
-114.55 | 1 | < 0.1%
-114.56 | 1 | < 0.1%
-114.57 | 1 | < 0.1%
-114.59 | 1 | < 0.1%
-114.6 | 2 | < 0.1%
-114.62 | 1 | < 0.1%
-114.63 | 1 | < 0.1%

latitude
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 794
Distinct (%): 7.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.658314
Minimum: 32.55
Maximum: 41.95
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 32.55
5-th percentile: 32.83
Q1: 33.94
median: 34.28
Q3: 37.73
95-th percentile: 38.98
Maximum: 41.95
Range: 9.4
Interquartile range (IQR): 3.79

Descriptive statistics

Standard deviation: 2.137125726
Coefficient of variation (CV): 0.0599334485
Kurtosis: -1.138330753
Mean: 35.658314
Median Absolute Deviation (MAD): 1.29
Skewness: 0.4441786788
Sum: 356583.14
Variance: 4.567306368
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
34.06 | 125 | 1.2%
34.02 | 110 | 1.1%
34.08 | 109 | 1.1%
34.05 | 108 | 1.1%
34.04 | 107 | 1.1%
34.07 | 107 | 1.1%
33.97 | 99 | 1.0%
34.09 | 99 | 1.0%
33.94 | 98 | 1.0%
34.03 | 94 | 0.9%
Other values (784) | 8944 | 89.4%

Value | Count | Frequency (%)
32.55 | 3 | < 0.1%
32.56 | 5 | 0.1%
32.57 | 6 | 0.1%
32.58 | 10 | 0.1%
32.59 | 4 | < 0.1%
32.6 | 2 | < 0.1%
32.61 | 3 | < 0.1%
32.62 | 8 | 0.1%
32.63 | 10 | 0.1%
32.64 | 11 | 0.1%

Value | Count | Frequency (%)
41.95 | 1 | < 0.1%
41.92 | 1 | < 0.1%
41.88 | 1 | < 0.1%
41.86 | 1 | < 0.1%
41.84 | 1 | < 0.1%
41.81 | 1 | < 0.1%
41.8 | 1 | < 0.1%
41.79 | 1 | < 0.1%
41.78 | 2 | < 0.1%
41.77 | 1 | < 0.1%

housing_median_age
Real number (ℝ≥0)

Distinct: 52
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.5256
Minimum: 1
Maximum: 52
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8
Q1: 18
median: 28
Q3: 37
95-th percentile: 52
Maximum: 52
Range: 51
Interquartile range (IQR): 19

Descriptive statistics

Standard deviation: 12.6589482
Coefficient of variation (CV): 0.4437750021
Kurtosis: -0.8019724061
Mean: 28.5256
Median Absolute Deviation (MAD): 10
Skewness: 0.07901039379
Sum: 285256
Variance: 160.2489695
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
52 | 648 | 6.5%
36 | 385 | 3.9%
35 | 381 | 3.8%
16 | 370 | 3.7%
17 | 354 | 3.5%
34 | 338 | 3.4%
33 | 313 | 3.1%
26 | 292 | 2.9%
25 | 278 | 2.8%
18 | 274 | 2.7%
Other values (42) | 6367 | 63.7%

Value | Count | Frequency (%)
1 | 4 | < 0.1%
2 | 27 | 0.3%
3 | 28 | 0.3%
4 | 98 | 1.0%
5 | 115 | 1.1%
6 | 77 | 0.8%
7 | 85 | 0.9%
8 | 112 | 1.1%
9 | 110 | 1.1%
10 | 136 | 1.4%

Value | Count | Frequency (%)
52 | 648 | 6.5%
51 | 27 | 0.3%
50 | 52 | 0.5%
49 | 57 | 0.6%
48 | 97 | 1.0%
47 | 99 | 1.0%
46 | 118 | 1.2%
45 | 131 | 1.3%
44 | 169 | 1.7%
43 | 165 | 1.7%

total_rooms
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 4504
Distinct (%): 45.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2639.8222
Minimum: 6
Maximum: 37937
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 6
5-th percentile: 619.95
Q1: 1440
median: 2123.5
Q3: 3155
95-th percentile: 6200.05
Maximum: 37937
Range: 37931
Interquartile range (IQR): 1715

Descriptive statistics

Standard deviation: 2207.480098
Coefficient of variation (CV): 0.836223022
Kurtosis: 34.9027477
Mean: 2639.8222
Median Absolute Deviation (MAD): 799.5
Skewness: 4.293792926
Sum: 26398222
Variance: 4872968.381
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1613 | 11 | 0.1%
2053 | 10 | 0.1%
1562 | 9 | 0.1%
1387 | 9 | 0.1%
1729 | 9 | 0.1%
1999 | 9 | 0.1%
1834 | 9 | 0.1%
1548 | 9 | 0.1%
1722 | 9 | 0.1%
1471 | 9 | 0.1%
Other values (4494) | 9907 | 99.1%

Value | Count | Frequency (%)
6 | 1 | < 0.1%
11 | 1 | < 0.1%
12 | 1 | < 0.1%
15 | 1 | < 0.1%
16 | 1 | < 0.1%
18 | 3 | < 0.1%
19 | 2 | < 0.1%
20 | 1 | < 0.1%
21 | 1 | < 0.1%
25 | 1 | < 0.1%

Value | Count | Frequency (%)
37937 | 1 | < 0.1%
32627 | 1 | < 0.1%
32054 | 1 | < 0.1%
30405 | 1 | < 0.1%
30401 | 1 | < 0.1%
27700 | 1 | < 0.1%
26322 | 1 | < 0.1%
25957 | 1 | < 0.1%
23915 | 1 | < 0.1%
23866 | 1 | < 0.1%

total_bedrooms
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1582
Distinct (%): 16.0%
Missing: 99
Missing (%): 1.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 537.3978386
Minimum: 2
Maximum: 6445
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 2
5-th percentile: 136
Q1: 294
median: 434
Q3: 647
95-th percentile: 1259
Maximum: 6445
Range: 6443
Interquartile range (IQR): 353

Descriptive statistics

Standard deviation: 424.5896795
Coefficient of variation (CV): 0.790084457
Kurtosis: 23.70419168
Mean: 537.3978386
Median Absolute Deviation (MAD): 162
Skewness: 3.583189769
Sum: 5320776
Variance: 180276.396
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
280 | 30 | 0.3%
287 | 30 | 0.3%
300 | 30 | 0.3%
416 | 27 | 0.3%
361 | 27 | 0.3%
460 | 27 | 0.3%
254 | 27 | 0.3%
272 | 26 | 0.3%
289 | 26 | 0.3%
291 | 25 | 0.2%
Other values (1572) | 9626 | 96.3%
(Missing) | 99 | 1.0%

Value | Count | Frequency (%)
2 | 1 | < 0.1%
3 | 4 | < 0.1%
4 | 5 | 0.1%
5 | 4 | < 0.1%
6 | 1 | < 0.1%
7 | 2 | < 0.1%
8 | 7 | 0.1%
9 | 3 | < 0.1%
10 | 4 | < 0.1%
11 | 4 | < 0.1%

Value | Count | Frequency (%)
6445 | 1 | < 0.1%
5471 | 1 | < 0.1%
5290 | 1 | < 0.1%
4957 | 1 | < 0.1%
4952 | 1 | < 0.1%
4798 | 1 | < 0.1%
4585 | 1 | < 0.1%
4492 | 1 | < 0.1%
4407 | 1 | < 0.1%
4335 | 1 | < 0.1%

population
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 3077
Distinct (%): 30.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1422.4701
Minimum: 5
Maximum: 28566
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 5
5-th percentile: 349
Q1: 781
median: 1164
Q3: 1728
95-th percentile: 3269.45
Maximum: 28566
Range: 28561
Interquartile range (IQR): 947

Descriptive statistics

Standard deviation: 1128.547999
Coefficient of variation (CV): 0.7933720357
Kurtosis: 53.22359773
Mean: 1422.4701
Median Absolute Deviation (MAD): 444
Skewness: 4.515762475
Sum: 14224701
Variance: 1273620.586
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
999 | 14 | 0.1%
891 | 14 | 0.1%
825 | 13 | 0.1%
761 | 13 | 0.1%
784 | 12 | 0.1%
990 | 12 | 0.1%
899 | 12 | 0.1%
746 | 12 | 0.1%
984 | 12 | 0.1%
1047 | 12 | 0.1%
Other values (3067) | 9874 | 98.7%

Value | Count | Frequency (%)
5 | 1 | < 0.1%
8 | 4 | < 0.1%
9 | 2 | < 0.1%
13 | 2 | < 0.1%
15 | 1 | < 0.1%
17 | 2 | < 0.1%
18 | 1 | < 0.1%
19 | 1 | < 0.1%
20 | 1 | < 0.1%
21 | 1 | < 0.1%

Value | Count | Frequency (%)
28566 | 1 | < 0.1%
16122 | 1 | < 0.1%
15507 | 1 | < 0.1%
15037 | 1 | < 0.1%
13251 | 1 | < 0.1%
12873 | 1 | < 0.1%
12153 | 1 | < 0.1%
11973 | 1 | < 0.1%
11272 | 1 | < 0.1%
10877 | 1 | < 0.1%

households
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1487
Distinct (%): 14.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 499.323
Minimum: 2
Maximum: 6082
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 2
5-th percentile: 124
Q1: 277
median: 409
Q3: 605
95-th percentile: 1143.1
Maximum: 6082
Range: 6080
Interquartile range (IQR): 328

Descriptive statistics

Standard deviation: 386.1194174
Coefficient of variation (CV): 0.7732858639
Kurtosis: 24.88322346
Mean: 499.323
Median Absolute Deviation (MAD): 152
Skewness: 3.59569574
Sum: 4993230
Variance: 149088.2045
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
309 | 30 | 0.3%
362 | 29 | 0.3%
428 | 29 | 0.3%
239 | 28 | 0.3%
306 | 28 | 0.3%
375 | 28 | 0.3%
267 | 28 | 0.3%
265 | 27 | 0.3%
274 | 27 | 0.3%
369 | 27 | 0.3%
Other values (1477) | 9719 | 97.2%

Value | Count | Frequency (%)
2 | 1 | < 0.1%
3 | 3 | < 0.1%
4 | 3 | < 0.1%
5 | 3 | < 0.1%
6 | 3 | < 0.1%
7 | 4 | < 0.1%
8 | 5 | 0.1%
9 | 6 | 0.1%
10 | 2 | < 0.1%
11 | 2 | < 0.1%

Value | Count | Frequency (%)
6082 | 1 | < 0.1%
5189 | 1 | < 0.1%
5050 | 1 | < 0.1%
4616 | 1 | < 0.1%
4490 | 1 | < 0.1%
4372 | 1 | < 0.1%
4339 | 1 | < 0.1%
4176 | 1 | < 0.1%
4072 | 1 | < 0.1%
4012 | 1 | < 0.1%

median_income
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 7329
Distinct (%): 73.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.88756159
Minimum: 0.4999
Maximum: 15.0001
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0.4999
5-th percentile: 1.6007
Q1: 2.580775
median: 3.5417
Q3: 4.7708
95-th percentile: 7.352105
Maximum: 15.0001
Range: 14.5002
Interquartile range (IQR): 2.190025

Descriptive statistics

Standard deviation: 1.914270559
Coefficient of variation (CV): 0.4924090629
Kurtosis: 5.227155227
Mean: 3.88756159
Median Absolute Deviation (MAD): 1.0676
Skewness: 1.68056668
Sum: 38875.6159
Variance: 3.664431774
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
15.0001 | 31 | 0.3%
4.125 | 25 | 0.2%
3.875 | 24 | 0.2%
3.125 | 23 | 0.2%
2.875 | 23 | 0.2%
4 | 20 | 0.2%
2.125 | 19 | 0.2%
4.625 | 19 | 0.2%
3.25 | 17 | 0.2%
3.375 | 16 | 0.2%
Other values (7319) | 9783 | 97.8%

Value | Count | Frequency (%)
0.4999 | 7 | 0.1%
0.536 | 4 | < 0.1%
0.5495 | 1 | < 0.1%
0.6433 | 1 | < 0.1%
0.6825 | 1 | < 0.1%
0.696 | 1 | < 0.1%
0.7007 | 1 | < 0.1%
0.7025 | 1 | < 0.1%
0.7068 | 1 | < 0.1%
0.7069 | 1 | < 0.1%

Value | Count | Frequency (%)
15.0001 | 31 | 0.3%
15 | 1 | < 0.1%
14.5833 | 1 | < 0.1%
13.947 | 1 | < 0.1%
13.8093 | 1 | < 0.1%
13.5728 | 1 | < 0.1%
13.367 | 1 | < 0.1%
13.2986 | 1 | < 0.1%
13.1867 | 1 | < 0.1%
13.1738 | 1 | < 0.1%

median_house_value
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 3228
Distinct (%): 32.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 207235.5953
Minimum: 14999
Maximum: 500001
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 14999
5-th percentile: 67000
Q1: 121075
median: 179750
Q3: 265100
95-th percentile: 481570
Maximum: 500001
Range: 485002
Interquartile range (IQR): 144025

Descriptive statistics

Standard deviation: 114892.0616
Coefficient of variation (CV): 0.5544031247
Kurtosis: 0.3219888047
Mean: 207235.5953
Median Absolute Deviation (MAD): 68150
Skewness: 0.9734128898
Sum: 2072355953
Variance: 1.320018582 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
500001 | 456 | 4.6%
137500 | 62 | 0.6%
162500 | 58 | 0.6%
112500 | 48 | 0.5%
87500 | 40 | 0.4%
187500 | 40 | 0.4%
225000 | 38 | 0.4%
275000 | 37 | 0.4%
350000 | 36 | 0.4%
175000 | 31 | 0.3%
Other values (3218) | 9154 | 91.5%

Value | Count | Frequency (%)
14999 | 3 | < 0.1%
17500 | 1 | < 0.1%
22500 | 3 | < 0.1%
25000 | 1 | < 0.1%
26900 | 1 | < 0.1%
27500 | 1 | < 0.1%
30000 | 2 | < 0.1%
32500 | 1 | < 0.1%
32900 | 1 | < 0.1%
33200 | 1 | < 0.1%

Value | Count | Frequency (%)
500001 | 456 | 4.6%
500000 | 9 | 0.1%
499100 | 1 | < 0.1%
498800 | 1 | < 0.1%
498700 | 1 | < 0.1%
497400 | 1 | < 0.1%
495800 | 1 | < 0.1%
495600 | 1 | < 0.1%
495500 | 2 | < 0.1%
494400 | 1 | < 0.1%

ocean_proximity
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

<1H OCEAN: 4418
INLAND: 3165
NEAR OCEAN: 1284
NEAR BAY: 1131
ISLAND: 2

Length

Max length: 10
Median length: 9
Mean length: 8.0652
Min length: 6

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NEAR OCEAN
2nd row: INLAND
3rd row: <1H OCEAN
4th row: INLAND
5th row: <1H OCEAN

Common Values

Value | Count | Frequency (%)
<1H OCEAN | 4418 | 44.2%
INLAND | 3165 | 31.6%
NEAR OCEAN | 1284 | 12.8%
NEAR BAY | 1131 | 11.3%
ISLAND | 2 | < 0.1%

Length

Histogram of lengths of the category

Pie chart

[Figure: pie chart of the ocean_proximity categories]

Value | Count | Frequency (%)
ocean | 5702 | 33.9%
1h | 4418 | 26.2%
inland | 3165 | 18.8%
near | 2415 | 14.3%
bay | 1131 | 6.7%
island | 2 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
No values found.

Most occurring categories

Value | Count | Frequency (%)
No values found.

Most frequent character per category

Most occurring scripts

Value | Count | Frequency (%)
No values found.

Most frequent character per script

Most occurring blocks

Value | Count | Frequency (%)
No values found.

Most frequent character per block

Interactions

[Figures: 100 pairwise interaction plots for the 10 numeric variables]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
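
The calculation described above can be sketched in plain Python. This is an illustrative sketch, not the code the report itself runs; the function names and sample values are hypothetical, and tied values are assumed to receive the average of the ranks they span.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Covariance over the product of standard deviations (1/n factors cancel)."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]          # hypothetical sample
y = [v ** 3 for v in x]      # monotonic but non-linear in x
print(spearman_rho(x, y))    # rank correlation
print(pearson_r(x, y))       # linear correlation, for comparison
```

Because `y` increases whenever `x` does, the ranks coincide and ρ reaches its maximum, while the linear coefficient on the raw values stays below 1.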

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
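
A minimal sketch of this formula, using made-up values, also illustrates the invariance property: shifting and positively rescaling either variable leaves r unchanged, while a negative scale factor flips its sign. The function name and data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]    # hypothetical sample
y = [2.1, 3.9, 6.2, 7.8, 10.1]   # roughly linear in x

r = pearson_r(x, y)
# Separate changes of location and (positive) scale for each variable.
r_transformed = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
# A negative scale factor reverses the direction of the relationship.
r_flipped = pearson_r(x, [-b for b in y])

print(r, r_transformed, r_flipped)
```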

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
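
The pair-counting definition can be sketched directly. The version below follows the wording above literally (often called tau-a); it is an assumption that this matches the report only when there are no ties, since library implementations commonly apply a tie correction (tau-b). The function name and data are hypothetical.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant - discordant) / total number of pairs, ignoring tie corrections."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move the same way
        elif direction < 0:
            discordant += 1   # the variables move in opposite ways
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


x = [1, 2, 3, 4, 5]   # hypothetical sample
y = [3, 1, 4, 2, 5]
print(kendall_tau(x, y))  # 7 concordant and 3 discordant pairs out of 10
```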

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
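
To make the first two displays concrete, the sketch below builds the underlying quantities, a per-column missing count and a presence/absence matrix, for a few toy rows in plain Python. The rows and their values are hypothetical and only borrow column names from this dataset; it is not the plotting code behind the figures.

```python
# Toy rows (hypothetical values); None marks a missing cell.
rows = [
    {"total_rooms": 100.0, "total_bedrooms": 20.0, "population": 50.0},
    {"total_rooms": 250.0, "total_bedrooms": None, "population": 120.0},
    {"total_rooms": 80.0, "total_bedrooms": None, "population": 30.0},
    {"total_rooms": 310.0, "total_bedrooms": 60.0, "population": 140.0},
]
columns = list(rows[0])

# Nullity by column: how many cells are missing in each column.
missing_per_column = {
    col: sum(row[col] is None for row in rows) for col in columns
}

# Nullity matrix: one row per observation, True where a value is present.
nullity_matrix = [[row[col] is not None for col in columns] for row in rows]

print(missing_per_column)
for present in nullity_matrix:
    # '#' marks a present value and '.' a missing one.
    print("".join("#" if p else "." for p in present))
```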

Sample

First rows

row | df_index | longitude | latitude | housing_median_age | total_rooms | total_bedrooms | population | households | median_income | median_house_value | ocean_proximity
0 | 14486 | -117.25 | 32.86 | 30.0 | 1670.0 | 219.0 | 606.0 | 202.0 | 12.4429 | 500001.0 | NEAR OCEAN
1 | 2095 | -119.76 | 36.75 | 35.0 | 1607.0 | 383.0 | 1407.0 | 382.0 | 2.1900 | 53400.0 | INLAND
2 | 7522 | -118.24 | 33.91 | 40.0 | 972.0 | 240.0 | 761.0 | 225.0 | 1.4688 | 88200.0 | <1H OCEAN
3 | 20592 | -121.58 | 39.14 | 52.0 | 662.0 | 160.0 | 520.0 | 149.0 | 0.8928 | 55000.0 | INLAND
4 | 18166 | -122.02 | 37.35 | 18.0 | 1221.0 | 255.0 | 507.0 | 271.0 | 5.3679 | 228400.0 | <1H OCEAN
5 | 8344 | -118.34 | 33.93 | 35.0 | 1213.0 | 284.0 | 742.0 | 253.0 | 4.0625 | 159900.0 | <1H OCEAN
6 | 2433 | -119.63 | 36.60 | 33.0 | 1589.0 | 294.0 | 1102.0 | 307.0 | 1.9676 | 62400.0 | INLAND
7 | 3111 | -117.68 | 35.62 | 30.0 | 2994.0 | 741.0 | 1481.0 | 581.0 | 2.1458 | 52400.0 | INLAND
8 | 10408 | -117.58 | 33.65 | 4.0 | 2000.0 | 422.0 | 833.0 | 386.0 | 5.7709 | 190300.0 | <1H OCEAN
9 | 18033 | -121.94 | 37.24 | 26.0 | 2561.0 | 388.0 | 1165.0 | 393.0 | 7.3522 | 363800.0 | <1H OCEAN

Last rows

row | df_index | longitude | latitude | housing_median_age | total_rooms | total_bedrooms | population | households | median_income | median_house_value | ocean_proximity
9990 | 15702 | -122.44 | 37.79 | 52.0 | 3785.0 | 808.0 | 1371.0 | 799.0 | 6.4209 | 500001.0 | NEAR BAY
9991 | 19486 | -120.98 | 37.67 | 33.0 | 1433.0 | 298.0 | 824.0 | 302.0 | 2.7621 | 109100.0 | INLAND
9992 | 7311 | -118.20 | 33.99 | 31.0 | 1186.0 | 387.0 | 2087.0 | 409.0 | 1.9132 | 154600.0 | <1H OCEAN
9993 | 1180 | -121.54 | 39.47 | 14.0 | 1724.0 | 315.0 | 939.0 | 302.0 | 2.4952 | 53900.0 | INLAND
9994 | 5202 | -118.28 | 33.94 | 44.0 | 1631.0 | 338.0 | 1197.0 | 355.0 | 3.0788 | 100000.0 | <1H OCEAN
9995 | 4806 | -118.34 | 34.02 | 49.0 | 1609.0 | 371.0 | 896.0 | 389.0 | 2.5156 | 136600.0 | <1H OCEAN
9996 | 7040 | -118.09 | 33.94 | 36.0 | 2762.0 | 472.0 | 1576.0 | 493.0 | 4.0846 | 183400.0 | <1H OCEAN
9997 | 9939 | -122.28 | 38.22 | 42.0 | 106.0 | 18.0 | 40.0 | 25.0 | 7.5197 | 275000.0 | NEAR BAY
9998 | 13314 | -117.65 | 34.07 | 52.0 | 1041.0 | 252.0 | 558.0 | 231.0 | 1.9236 | 117200.0 | INLAND
9999 | 134 | -122.19 | 37.83 | 28.0 | 1326.0 | 184.0 | 463.0 | 190.0 | 8.2049 | 335200.0 | NEAR BAY